Running Documentation with Config

This notebook guides you through using the run_documentation_tests() function within the ValidMind Developer Framework. The function takes config as a parameter that enables developers to configure inputs and parameters for individual tests.

As a model developer, you will find that configuring individual tests is useful in many model development scenarios. For instance, a particular use case might require different inputs or parameters for certain tests. The run_documentation_tests() function lets you configure tests directly through config, giving you the flexibility to run tests according to your use case.

This guide includes the code required to:

  • Load the demo dataset
  • Preprocess the raw dataset
  • Train a model for testing
  • Initialize ValidMind objects
  • Run documentation tests with custom configuration

Install the client library

The client library provides Python support for the ValidMind Developer Framework. To install it:

%pip install -q validmind
Note: you may need to restart the kernel to use updated packages.

Initialize the client library

ValidMind generates a unique code snippet for each registered model to connect with your developer environment. You initialize the client library with this code snippet, which ensures that your documentation and tests are uploaded to the correct model when you run the notebook.

Get your code snippet:

  1. In a browser, log into the Platform UI.

  2. In the left sidebar, navigate to Model Inventory and click + Register new model.

  3. Enter the model details and click Continue.

    For example, to register a model for use with this notebook, select:

    • Documentation template: Binary classification
    • Use case: Marketing/Sales - Attrition/Churn Management

    You can fill in other options according to your preference.

  4. Go to Getting Started and click Copy snippet to clipboard.

Next, replace this placeholder with your own code snippet:

# Replace with your code snippet

import validmind as vm

vm.init(
    api_host="https://api.prod.validmind.ai/api/v1/tracking",
    api_key="...",
    api_secret="...",
    project="...",
)
2024-04-10 17:31:12,908 - INFO(validmind.api_client): Connected to ValidMind. Project: [Int. Tests] Customer Churn - Initial Validation (cltnl29bz00051omgwepjgu1r)
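
If you prefer not to paste credentials into the notebook, one option is to read them from environment variables and pass them to vm.init(). The following is a minimal sketch under stated assumptions: the environment variable names (MY_VM_API_HOST and so on) and the helper load_vm_credentials are hypothetical and not part of the ValidMind library. Only the four keyword arguments come from the snippet above.

```python
import os

# Hypothetical environment variable names; pick whatever fits your setup.
ENV_TO_ARG = {
    "MY_VM_API_HOST": "api_host",
    "MY_VM_API_KEY": "api_key",
    "MY_VM_API_SECRET": "api_secret",
    "MY_VM_PROJECT": "project",
}


def load_vm_credentials(environ=None):
    """Collect vm.init() keyword arguments from environment variables.

    Raises KeyError naming every missing variable, so a misconfigured
    environment fails before any connection attempt is made.
    """
    environ = os.environ if environ is None else environ
    missing = [name for name in ENV_TO_ARG if not environ.get(name)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return {arg: environ[name] for name, arg in ENV_TO_ARG.items()}


# Usage (uncomment once the variables are set in your environment):
# import validmind as vm
# vm.init(**load_vm_credentials())
```

Failing early on a missing variable keeps the error close to its cause instead of surfacing later as a connection problem.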

Preview the documentation template

A template predefines sections for your documentation project and provides a general outline to follow, making the documentation process much easier.

You will upload documentation and test results into this template later on. For now, take a look at the structure that the template provides with the vm.preview_template() function from the ValidMind library and note the empty sections:

vm.preview_template()

Load the sample dataset

The sample dataset used here is provided by the ValidMind library. To be able to use it, you need to import the dataset and load it into a pandas DataFrame, a two-dimensional tabular data structure that makes use of rows and columns:

# Import the sample dataset from the library

from validmind.datasets.classification import customer_churn as demo_dataset

print(
    f"Loaded demo dataset with: \n\n\t• Target column: '{demo_dataset.target_column}' \n\t• Class labels: {demo_dataset.class_labels}"
)

raw_df = demo_dataset.load_data()
raw_df.head()
Loaded demo dataset with: 

    • Target column: 'Exited' 
    • Class labels: {'0': 'Did not exit', '1': 'Exited'}
CreditScore Geography Gender Age Tenure Balance NumOfProducts HasCrCard IsActiveMember EstimatedSalary Exited
0 619 France Female 42 2 0.00 1 1 1 101348.88 1
1 608 Spain Female 41 1 83807.86 1 0 1 112542.58 0
2 502 France Female 42 8 159660.80 3 1 0 113931.57 1
3 699 France Female 39 1 0.00 2 0 0 93826.63 0
4 850 Spain Female 43 2 125510.82 1 1 1 79084.10 0

Document the model

As part of documenting the model with the ValidMind Developer Framework, you need to preprocess the raw dataset, initialize some training and test datasets, initialize a model object you can use for testing, and then run the full suite of tests.

Preprocess the raw dataset

Preprocessing and training, covered in this section and the next, involve the following steps:

  • Preprocess the data: Splits the raw DataFrame (raw_df) into multiple datasets (train_df, validation_df, and test_df) using demo_dataset.preprocess to simplify preprocessing.
  • Separate features and targets: Drops the target column to create feature sets (x_train, x_val) and target sets (y_train, y_val).
  • Initialize XGBoost classifier: Creates an XGBClassifier object with early stopping rounds set to 10.
  • Set evaluation metrics: Specifies metrics for model evaluation as “error,” “logloss,” and “auc.”
  • Fit the model: Trains the model on x_train and y_train using the validation set (x_val, y_val). Verbose output is disabled.
train_df, validation_df, test_df = demo_dataset.preprocess(raw_df)

Train a model for testing

We train a simple customer churn model for our test.

import xgboost
%matplotlib inline

x_train = train_df.drop(demo_dataset.target_column, axis=1)
y_train = train_df[demo_dataset.target_column]
x_val = validation_df.drop(demo_dataset.target_column, axis=1)
y_val = validation_df[demo_dataset.target_column]

xgb = xgboost.XGBClassifier(early_stopping_rounds=10)
xgb.set_params(
    eval_metric=["error", "logloss", "auc"],
)
xgb.fit(
    x_train,
    y_train,
    eval_set=[(x_val, y_val)],
    verbose=False,
)
XGBClassifier(base_score=None, booster=None, callbacks=None,
              colsample_bylevel=None, colsample_bynode=None,
              colsample_bytree=None, early_stopping_rounds=10,
              enable_categorical=False, eval_metric=['error', 'logloss', 'auc'],
              feature_types=None, gamma=None, gpu_id=None, grow_policy=None,
              importance_type=None, interaction_constraints=None,
              learning_rate=None, max_bin=None, max_cat_threshold=None,
              max_cat_to_onehot=None, max_delta_step=None, max_depth=None,
              max_leaves=None, min_child_weight=None, missing=nan,
              monotone_constraints=None, n_estimators=100, n_jobs=None,
              num_parallel_tree=None, predictor=None, random_state=None, ...)

Initialize ValidMind objects

Initialize ValidMind model object

Before you can run tests, you must first initialize a ValidMind model object using the init_model function from the ValidMind (vm) module.

This function takes a number of arguments:

  • model — the model that you want to provide as input to tests
  • input_id - a unique identifier that allows tracking what inputs are used when running each individual test
vm_model_xgb = vm.init_model(
    xgb,
    input_id="xgb",
)

Initialize the ValidMind datasets

Similarly, initialize a ValidMind dataset object using the init_dataset function from the ValidMind (vm) module.

This function takes a number of arguments:

  • dataset — the raw dataset that you want to provide as input to tests
  • input_id - a unique identifier that allows tracking what inputs are used when running each individual test
  • target_column — a required argument if tests require access to true values. This is the name of the target column in the dataset
  • class_labels — an optional value to map predicted classes to class labels

With all datasets ready, you can now initialize the raw, training and test datasets (raw_df, train_df and test_df) created earlier into their own dataset objects using vm.init_dataset():

vm_raw_ds = vm.init_dataset(
    input_id="raw_dataset",
    dataset=raw_df,
    target_column=demo_dataset.target_column,
)

feature_columns = [
    "CreditScore",
    "Gender",
    "Age",
    "Tenure",
    "Balance",
    "NumOfProducts",
    "HasCrCard",
    "IsActiveMember",
    "EstimatedSalary",
    "Geography_France",
    "Geography_Germany",
    "Geography_Spain",
]

vm_train_ds = vm.init_dataset(
    input_id="train_dataset",
    dataset=train_df,
    target_column=demo_dataset.target_column,
    feature_columns=feature_columns,
)

vm_test_ds = vm.init_dataset(
    input_id="test_dataset",
    dataset=test_df,
    target_column=demo_dataset.target_column,
    feature_columns=feature_columns,
)
2024-04-10 17:31:14,442 - INFO(validmind.client): Pandas dataset detected. Initializing VM Dataset instance...
2024-04-10 17:31:14,503 - INFO(validmind.client): Pandas dataset detected. Initializing VM Dataset instance...
2024-04-10 17:31:14,555 - INFO(validmind.client): Pandas dataset detected. Initializing VM Dataset instance...
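
The feature_columns list above is written out by hand. As an alternative, it could be derived from the column names of the preprocessed DataFrame. This is a minimal sketch that assumes the DataFrame contains only feature columns plus the target; the helper derive_feature_columns is hypothetical and not part of the ValidMind library.

```python
def derive_feature_columns(columns, target_column):
    """Return every column name except the target, preserving order."""
    columns = list(columns)
    if target_column not in columns:
        raise ValueError(f"Target column {target_column!r} not found")
    return [name for name in columns if name != target_column]


# A short illustrative column list; in the notebook you would pass
# train_df.columns and demo_dataset.target_column instead.
example_columns = ["CreditScore", "Age", "Geography_France", "Exited"]
example_features = derive_feature_columns(example_columns, "Exited")
print(example_features)
```

Deriving the list keeps it in sync with the preprocessing output, at the cost of silently including any extra column that preprocessing might add.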

Run predictions through assign_predictions interface

We can use assign_predictions() to run and assign model predictions to our training and test datasets:

vm_train_ds.assign_predictions(model=vm_model_xgb)
vm_test_ds.assign_predictions(model=vm_model_xgb)
2024-04-10 17:31:14,609 - INFO(validmind.vm_models.dataset): Running predict()... This may take a while
2024-04-10 17:31:14,611 - INFO(validmind.vm_models.dataset): Running predict()... This may take a while

Run documentation tests with custom configuration

Preview config

You can preview the default config for the documentation template using the vm.get_test_suite().get_default_config() interface.

import json

project_test_suite = vm.get_test_suite()
config = project_test_suite.get_default_config()
print("Suite Config: \n", json.dumps(config, indent=2))
Suite Config: 
 {
  "validmind.data_validation.DatasetDescription": {
    "inputs": {
      "dataset": null
    },
    "params": {}
  },
  "validmind.data_validation.ClassImbalance": {
    "inputs": {
      "dataset": null
    },
    "params": {
      "min_percent_threshold": 10
    }
  },
  "validmind.data_validation.Duplicates": {
    "inputs": {
      "dataset": null
    },
    "params": {
      "min_threshold": 1
    }
  },
  "validmind.data_validation.HighCardinality": {
    "inputs": {
      "dataset": null
    },
    "params": {
      "num_threshold": 100,
      "percent_threshold": 0.1,
      "threshold_type": "percent"
    }
  },
  "validmind.data_validation.MissingValues": {
    "inputs": {
      "dataset": null
    },
    "params": {
      "min_threshold": 1
    }
  },
  "validmind.data_validation.Skewness": {
    "inputs": {
      "dataset": null
    },
    "params": {
      "max_threshold": 1
    }
  },
  "validmind.data_validation.UniqueRows": {
    "inputs": {
      "dataset": null
    },
    "params": {
      "min_percent_threshold": 1
    }
  },
  "validmind.data_validation.TooManyZeroValues": {
    "inputs": {
      "dataset": null
    },
    "params": {
      "max_percent_threshold": 0.03
    }
  },
  "validmind.data_validation.IQROutliersTable": {
    "inputs": {
      "dataset": null
    },
    "params": {
      "features": null,
      "threshold": 1.5
    }
  },
  "validmind.data_validation.IQROutliersBarPlot": {
    "inputs": {
      "dataset": null
    },
    "params": {
      "threshold": 1.5,
      "num_features": null,
      "fig_width": 800
    }
  },
  "validmind.data_validation.DescriptiveStatistics": {
    "inputs": {
      "dataset": null
    },
    "params": {}
  },
  "validmind.data_validation.PearsonCorrelationMatrix": {
    "inputs": {
      "dataset": null
    },
    "params": {}
  },
  "validmind.data_validation.HighPearsonCorrelation": {
    "inputs": {
      "dataset": null
    },
    "params": {
      "max_threshold": 0.3
    }
  },
  "validmind.model_validation.ModelMetadata": {
    "inputs": {
      "model": null
    },
    "params": {}
  },
  "validmind.data_validation.DatasetSplit": {
    "inputs": {
      "datasets": null
    },
    "params": {}
  },
  "validmind.model_validation.sklearn.PopulationStabilityIndex": {
    "inputs": {
      "model": null,
      "datasets": null
    },
    "params": {
      "num_bins": 10,
      "mode": "fixed"
    }
  },
  "validmind.model_validation.sklearn.ConfusionMatrix": {
    "inputs": {
      "model": null,
      "dataset": null
    },
    "params": {}
  },
  "validmind.model_validation.sklearn.ClassifierPerformance:in_sample": {
    "inputs": {
      "model": null,
      "dataset": null
    },
    "params": {}
  },
  "validmind.model_validation.sklearn.ClassifierPerformance:out_of_sample": {
    "inputs": {
      "model": null,
      "dataset": null
    },
    "params": {}
  },
  "validmind.model_validation.sklearn.PrecisionRecallCurve": {
    "inputs": {
      "model": null,
      "dataset": null
    },
    "params": {}
  },
  "validmind.model_validation.sklearn.ROCCurve": {
    "inputs": {
      "model": null,
      "dataset": null
    },
    "params": {}
  },
  "validmind.model_validation.sklearn.TrainingTestDegradation": {
    "inputs": {
      "model": null,
      "datasets": null
    },
    "params": {
      "metrics": [
        "accuracy",
        "precision",
        "recall",
        "f1"
      ],
      "max_threshold": 0.1
    }
  },
  "validmind.model_validation.sklearn.MinimumAccuracy": {
    "inputs": {
      "model": null,
      "dataset": null
    },
    "params": {
      "min_threshold": 0.7
    }
  },
  "validmind.model_validation.sklearn.MinimumF1Score": {
    "inputs": {
      "model": null,
      "dataset": null
    },
    "params": {
      "min_threshold": 0.5
    }
  },
  "validmind.model_validation.sklearn.MinimumROCAUCScore": {
    "inputs": {
      "model": null,
      "dataset": null
    },
    "params": {
      "min_threshold": 0.5
    }
  },
  "validmind.model_validation.sklearn.PermutationFeatureImportance": {
    "inputs": {
      "model": null,
      "dataset": null
    },
    "params": {
      "fontsize": null,
      "figure_height": 1000
    }
  },
  "validmind.model_validation.sklearn.SHAPGlobalImportance": {
    "inputs": {
      "model": null,
      "dataset": null
    },
    "params": {
      "kernel_explainer_samples": 10
    }
  },
  "validmind.model_validation.sklearn.WeakspotsDiagnosis": {
    "inputs": {
      "model": null,
      "datasets": null
    },
    "params": {
      "features_columns": null,
      "thresholds": {
        "accuracy": 0.75,
        "precision": 0.5,
        "recall": 0.5,
        "f1": 0.7
      }
    }
  },
  "validmind.model_validation.sklearn.OverfitDiagnosis": {
    "inputs": {
      "model": null,
      "datasets": null
    },
    "params": {
      "features_columns": null,
      "cut_off_percentage": 4
    }
  },
  "validmind.model_validation.sklearn.RobustnessDiagnosis": {
    "inputs": {
      "model": null,
      "datasets": null
    },
    "params": {
      "features_columns": null,
      "scaling_factor_std_dev_list": [
        0.0,
        0.1,
        0.2,
        0.3,
        0.4,
        0.5
      ],
      "accuracy_decay_threshold": 4
    }
  }
}
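
The default config is long, so it can help to summarize which input names each test expects before deciding what to override. The sketch below groups test IDs by the keys of their "inputs" entry. The helper group_tests_by_inputs is hypothetical, and sample_config is a four-entry excerpt of the output above rather than the full suite config.

```python
from collections import defaultdict


def group_tests_by_inputs(config):
    """Map each sorted tuple of input names to the test IDs that use it."""
    groups = defaultdict(list)
    for test_id, settings in config.items():
        required = tuple(sorted(settings.get("inputs", {})))
        groups[required].append(test_id)
    return dict(groups)


# Excerpt of the default config printed above (JSON null becomes None).
sample_config = {
    "validmind.data_validation.ClassImbalance": {
        "inputs": {"dataset": None},
        "params": {"min_percent_threshold": 10},
    },
    "validmind.data_validation.DatasetSplit": {
        "inputs": {"datasets": None},
        "params": {},
    },
    "validmind.model_validation.sklearn.ROCCurve": {
        "inputs": {"model": None, "dataset": None},
        "params": {},
    },
    "validmind.model_validation.sklearn.OverfitDiagnosis": {
        "inputs": {"model": None, "datasets": None},
        "params": {"features_columns": None, "cut_off_percentage": 4},
    },
}

input_groups = group_tests_by_inputs(sample_config)
for required, test_ids in input_groups.items():
    print(required, test_ids)
```

In the notebook, passing the config returned by get_default_config() in place of sample_config would produce the same kind of summary for the whole template.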

Updating config

You can update the test configuration to fit your use case and requirements. The following config sets the inputs explicitly for each model validation test:

config = {
    "validmind.data_validation.DatasetSplit": {
        "inputs": {"datasets": (vm_train_ds, vm_test_ds)},
    },
    "validmind.model_validation.sklearn.PopulationStabilityIndex": {
        "inputs": {"model": vm_model_xgb, "datasets": (vm_train_ds, vm_test_ds)},
    },
    "validmind.model_validation.sklearn.ConfusionMatrix": {
        "inputs": {"model": vm_model_xgb, "dataset": vm_test_ds},
    },
    "validmind.model_validation.sklearn.ClassifierPerformance:in_sample": {
        "inputs": {"model": vm_model_xgb, "dataset": vm_train_ds},
    },
    "validmind.model_validation.sklearn.ClassifierPerformance:out_of_sample": {
        "inputs": {"model": vm_model_xgb, "dataset": vm_test_ds},
    },
    "validmind.model_validation.sklearn.PrecisionRecallCurve": {
        "inputs": {"model": vm_model_xgb, "dataset": vm_test_ds},
    },
    "validmind.model_validation.sklearn.ROCCurve": {
        "inputs": {"model": vm_model_xgb, "dataset": vm_test_ds},
    },
    "validmind.model_validation.sklearn.TrainingTestDegradation": {
        "inputs": {"model": vm_model_xgb, "datasets": (vm_train_ds, vm_test_ds)},
    },
    "validmind.model_validation.sklearn.MinimumAccuracy": {
        "inputs": {"model": vm_model_xgb, "dataset": vm_test_ds},
    },
    "validmind.model_validation.sklearn.MinimumF1Score": {
        "inputs": {"model": vm_model_xgb, "dataset": vm_test_ds},
    },
    "validmind.model_validation.sklearn.MinimumROCAUCScore": {
        "inputs": {"model": vm_model_xgb, "dataset": vm_test_ds},
    },
    "validmind.model_validation.sklearn.PermutationFeatureImportance": {
        "inputs": {"model": vm_model_xgb, "dataset": vm_test_ds},
    },
    "validmind.model_validation.sklearn.SHAPGlobalImportance": {
        "inputs": {"model": vm_model_xgb, "dataset": vm_test_ds},
    },
    "validmind.model_validation.sklearn.WeakspotsDiagnosis": {
        "inputs": {"model": vm_model_xgb, "datasets": (vm_train_ds, vm_test_ds)},
    },
    "validmind.model_validation.sklearn.OverfitDiagnosis": {
        "inputs": {"model": vm_model_xgb, "datasets": (vm_train_ds, vm_test_ds)},
    },
    "validmind.model_validation.sklearn.RobustnessDiagnosis": {
        "inputs": {"model": vm_model_xgb, "datasets": (vm_train_ds, vm_test_ds)},
    },
}
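
Many entries in the config above repeat the same inputs. One way to reduce the repetition is to generate the dictionary from groups of test IDs that share inputs. This is a sketch only: build_inputs_config is a hypothetical helper, and placeholder strings stand in for the vm_model_xgb, vm_train_ds, and vm_test_ds objects so that the example runs on its own.

```python
SKLEARN = "validmind.model_validation.sklearn."


def build_inputs_config(groups):
    """Build a config dict from (inputs, test_ids) pairs."""
    config = {}
    for inputs, test_ids in groups:
        for test_id in test_ids:
            # Copy the mapping so entries do not share one mutable dict.
            config[test_id] = {"inputs": dict(inputs)}
    return config


# Placeholders; in the notebook, use the real ValidMind objects instead.
model, train_ds, test_ds = "vm_model_xgb", "vm_train_ds", "vm_test_ds"

example_config = build_inputs_config([
    (
        {"model": model, "dataset": test_ds},
        [SKLEARN + "ConfusionMatrix", SKLEARN + "ROCCurve"],
    ),
    (
        {"model": model, "datasets": (train_ds, test_ds)},
        [SKLEARN + "OverfitDiagnosis"],
    ),
])
print(sorted(example_config))
```

The explicit dictionary is easier to read at a glance, while the generated form is easier to keep consistent when many tests share the same model and datasets.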

Run documentation tests

You can now run all documentation tests, passing config as an extra parameter. It overrides the input and parameter configuration for the tests it lists.

full_suite = vm.run_documentation_tests(
    inputs={
        "dataset": vm_raw_ds,
        "model": vm_model_xgb,
    },
    config=config,
)
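
Because test IDs and parameter names in config are plain strings, a typo is easy to miss. The sketch below checks an override dictionary against a default config before it is used. It assumes that parameter overrides go under a "params" key, mirroring the structure of the default config shown earlier. The helper check_overrides is hypothetical, and the 0.8 threshold is an illustrative value, not a recommendation.

```python
def check_overrides(default_config, overrides):
    """Return a list of problems found in overrides (empty if none)."""
    problems = []
    for test_id, settings in overrides.items():
        if test_id not in default_config:
            problems.append(f"unknown test ID: {test_id}")
            continue
        known_params = default_config[test_id].get("params", {})
        for name in settings.get("params", {}):
            if name not in known_params:
                problems.append(f"{test_id}: unknown param {name!r}")
    return problems


# One-entry excerpt of the default config shown earlier.
default_excerpt = {
    "validmind.model_validation.sklearn.MinimumAccuracy": {
        "inputs": {"model": None, "dataset": None},
        "params": {"min_threshold": 0.7},
    },
}

good = {
    "validmind.model_validation.sklearn.MinimumAccuracy": {
        "params": {"min_threshold": 0.8},
    },
}
typo = {
    "validmind.model_validation.sklearn.MinimumAccuracy": {
        "params": {"min_treshold": 0.8},  # misspelled on purpose
    },
}

print(check_overrides(default_excerpt, good))
print(check_overrides(default_excerpt, typo))
```

In the notebook, the first argument could be the dictionary returned by get_default_config(), so that the check covers every test in the template.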